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Abstract 

In this paper we present an algorithm based on the sum-product algorithm that finds elements in the preimage of 
a feed-forward Boolean networks given an output of the network. Our probabilistic method runs in linear time 
with respect to the number of nodes in the network. We evaluate our algorithm for randomly constructed Boolean 
networks and a regulatory network of Escherichia coli and found that it gives a valid solution in most cases. 



Introduction 

In systems and computational biology Boolean networks 
(BN) are widely used to model regulative dependencies of 
organisms [1,2]. We consider networks, which map a set 
of environmental conditions to the presence of proteins 
and finally to actual chemical reactions, which are often 
modeled as fluxes of a flux-balance analysis [3]. Hence, 
these networks are used to make in silico predictions of 
behavior of organisms in a certain environment [4] . 

In this paper we address the inverse problem, i.e., we 
want to predict environmental conditions that allow cer- 
tain reactions to take place, and others not. Hence, in gen- 
eral, we need to find a set of possible inputs that lead to a 
given output. This so called predecessor problem or pre- 
image problem has been addressed by Wuensche in [5] 
and has been shown to NP-hard in general [6], which 
makes it infeasible to solve it for large networks. In [7] an 
algorithm with reduced complexity for BNs with canaliz- 
ing Boolean functions has been introduced. However, the 
problem is infeasible under certain conditions. Both algo- 
rithms are designed to find the whole set of preimages, i.e., 
all inputs to the BN with lead to a certain, desired, output. 

In some applications, knowledge of the whole preimage 
set is not important, merely it can be sufficient to know a 
subset of the preimage set. Here, we propose a probabilis- 
tic algorithm, which solves this problem in linear time 
with respect to the number of nodes in the network, based 
on a variation of the well known Sum-Product Algorithm 
(SPA) [8], which is used for a variety of tasks, including 
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decoding error correction codes in communication engi- 
neering [9]. 

Methods 

Boolean networks and main idea 

We consider networks like shown in Figure 1, mapping 
the values of the N in-nodes I = {1, 2, 3} to the M out- 
nodes O = {12, 13, 14, 15, 16}, i.e., we can represent this 
BN as a function mapping the N input values uniquely 
to the M output values: 

f : {0, 1} N -> {0, 1} M . 

The network itself consists of n nodes, and a set of 
directed edges connecting these nodes. Each node i has 
a certain state, which can be either zero or one, repre- 
sented by a variable x t . Its value is determined by evalu- 
ating a Boolean function (BF) fa. Further, lets define the 
set n(fj) as the incoming nodes of node /, For example 
in Figure 1, n(fs) = {1, 3}. The BF fj is a function map- 
ping kj = \n(fj)\ values of {0, 1}* to {0, 1}, where k is also 
called the in-degree of node /'. The number of edges 
emerging from a node is called out-degree. 

Given a vector of input values x |_ {0, 1} N , x = (x lt x 2 , ■ ■ ■ , 
x N ) the corresponding output of f is y = f (x), y|_ {0, 1} M . In 
general there does not exist a unique inverse function f" 1 . 
Instead the cardinality of the set €l y := {x : f(x) = y} will be 
larger one. We call Cl y the set of preimages of y. In this 
paper we are interested to find at least parts of Cl y . Suppose 
there is a probability distribution P y on {0, 1} N such that 
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— !— if x e Q v 
0 else 



If we knew the probability distribution P y , we would 
have solved the problem. But as explained, this is too 
difficult in general. Our main idea now is to approxi- 
mate P y by the product of the marginal distributions P t 
on the individual x it i.e., 



as the well-known SPA can be used to compute the 
marginals efficiently. If the approximation is good enough 
sampling out the product of the marginals will yield an 
element in Cl y with reasonable probability. 

Proposed algorithm 

In this section we will first discuss the basic principles 
of factor graphs and the SPA. Then we will describe 
the BN as factor graph and will formulate the actual 
algorithm to find the marginals. Finally, the sampling 
is described. 

Factor graphs and sum-product algorithm 

Assume some function g( ) defined on some 

domain A", which can be factorized in m local functions 
hp j e [m] := {1, 2, ... , m}, i.e., 



g(xi, x„) =Y[hj{Xj), 



where Xj is the subset of [n] containing the argument 
of hj . We can then define a factor graph [8] as a bipar- 
tite graph consisting of n nodes representing variables 
\x\, . . . , x„} (variable nodes) and of m nodes represent- 
ing functions {hi, . . . h m } (function node). Edges exist 
between a function node and a variable node if and only 
if Xi is an input to function hj . 

The marginal function gi(x t ) is defined as [8] 



where g ( x v 

^2,g[X\,...,Xn) 



t%n) is defined as 



•(*/! 



In general the computation of the g t is difficult, but 
due to the factorization of g the task can be efficiently 
solved using the the so called Sum-Product Algorithm 
(SPA) [8]. The algorithm iteratively passes messages 
between the nodes of the graph. At each iteration the 
messages fi are sent from the function nodes to the vari- 
able nodes, containing the corresponding marginal 
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function of the local function. These messages are com- 
puted as follows [8]: 

Function to variable node 

~{*J \ yen{h)\{x} J 

where n(i) give the set of neighboring nodes of node i. 

At the variable nodes, these messages are then com- 
bined to a marginal function A and sent back to the 
function nodes [8]: 

Variable to function node 

K-+h[x) = [J flq-+x[x). 
qen(x)\{h) 

The Boolean network as factor graph 

We apply the concept of factor graphs to BNs. Each node 
in the network represents one variable x t e {0, 1}, i e [n] 
of the factor graph, hence we have n variable nodes. Each 
BF f> of the BN (j e [n]\I) is a function node and is con- 
nected to the node / and the incoming nodes nifj). Lets to 
define % as the variables of the incoming nodes of node /', 

i.e. the argument of the BN fi. Further, we define xf ] as Xj 

without the node i. 

Finally, if we consider the variables as each node as 
random variables, we have a common distribution of all 
variables nodes described by the density function, 

&, x„(*l, • • • , Xn) = g(Xi, X„), 

For sake of readability we will omit the subscripts of the 
density function, if they are obvious from context. We are 
interested in finding the marginal distributions of the in- 
nodes, which can be described by the density functions 

&,■(*«) = J2 fa""' *») Vi ' e 1 

This problem is an instance of the problem described 
in Section Factor Graphs and Sum-Product Algorithm, 
hence we apply the same methods here. 

Update rule: function to variable node 

If we focus on one function node j e [n]\I there exists a 
common distribution of all variables relevant for this 
node. Namely, these relevant variables are the ones 
located in Xj of the BF fp and the value of node /. We 
can write the density of this distribution as: 

p{Xj,Xj). 

Lets define h(fj) as the set of indices of the input 
nodes of the BF fj. 



We need to send the local marginal distribution of 
each variable i e {/'} U n(fh back to the variable node, or 
more formally: 

Vi^ifa) = P( x )'ty = P(*i' (1) 

~{Xi) ~{Xi) 

If i = j, i.e. if the message is designated for the node 
containing the output of the BF, the density of the mar- 
ginal distribution becomes: 

~{Xj) 
~{Xj] 

which is the probability distribution of the functions 
output. We can assume that the elements of Xj are pair- 
wise independent, hence we can write: 

p&i) = n x> ^' 

where A/ is the probability distribution of variable 
node / and is defined in Eq. 3. 
In the other cases, i.e., i ^ /, Eq. (1) becomes: 

fij i(xi) = J2 p{xi\Xj,xf)- (Xjjf). 

We still can assume that the elements of Xj are pair- 
wise independent, hence we can write: 

p(Xj, n(fj)\x t )=p(Xj\xf ] y(xf ] ) 

= Pix j \xf ) ) Yi 

len(ff)\{i} 

If the Boolean functions output %j =fj(Xj) is already 

completely determined by X^'\ i.e., if the variable %i has 

no influence on the output for this particular choice of 
the other variables, we assume Xi to be uniformly dis- 
tributed: 

p{ Xi \xj,xf) = -fcjtftxf 0 , Xi)=Xj) 
and since Xj is completely determined by xj 

p(xj,xf)= n kite). 

Ie«(g)\W 

Otherwise, x t is totally determined by Xj and the other 
variables, i.e., is 0 or 1 depending on BF. Hence, we 
can write 

p{xi\Xj, n(fj)\Xi) = p Xj (f{Xj t] , Xi) = Xj), 
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where pxj{f{xj'\ Xi) = Xj) is either 0 or 1. Further we 
can assume Xj independent of xj l \ hence 

p{x 1 ,xf)=k j { Xj ) n \i{x x ). 

len{fj)\{H 

Finally, we can summarize for i * j: 
^ i {x i )=Yj^Px i {f(xf > ,Xi)=x j ) Yl M*l)< (2) 

-{*,) ten(fj)\{i} 



with 



t .. = J| Mfj{xf ] , Xl = o)=f j {x]", Xi = i) 



Update rule: variable to function node 

The update rule is the same for all variable nodes / e [n] 
and is independent of the function node to which they 
are directed. 

where Sj is the set of all function nodes, which have 
node / as input. 
Finding the input distributions 

In our algorithm, we use the well known log-likelihood 
ratio (LLR) to represent the probability distribution of 
binary variables [10]. It is defined as: 



In 



p{x = 0) 



(4) 



A scheme of the algorithm is given in Algorithm 1. 
The probability distribution of each node /' e [«] at 

iteration t is given as and are initialized with ij 0 ' = 0, 
which is equivalent to the uniform distribution. Then we 
set the LLRs for the out-nodes to either -oo or +°o 
depending on the desired output y of the BN. At each 
iteration the algorithm can be split in two steps. The first 
step iterates over all function nodes (j e and all 

input variables i 6 n(fj) calculating the LLR using 
Eq. (2) and Eq. (4). 

In the second step we update all variables-nodes, 
where the LLRs Lj represents the distributions Xj and, 
hence the product of Eq. 3 becomes a summation. 
Please note, that the LLR of the previous iteration is 
also added to the sum, in order to prevent rapid 
changes of the distributions. 

After performing a certain number of iterations t max , 
the desired marginal distributions of the input variables 
are found. 



Algorithm 1 

Initialize L? 0 -' 



0 for all nodes 



Set the desired LLRs of the out-nodes, i.e., Lj 0 ' is 
either -oo or +°o, for all out-nodes j e O. 
t = 0 
repeat 
t=t+l 

for each non-in-node (j € [n]\I)do 
for each input variable i e h(fj) do 

calculate L y _>; using Eq. (2) and Eq. (4) 
end for 

end for 

for each non-out-node v do 



L (I) = L 
end for 



(t-i) 



E 



r(0 



until maximum number of iterations reached 
Sampling 

The sampling part of our approach is straightforward. 
Using the marginal distributions Lj"\ j € I we ran- 
domly draw vectors x and check if they fulfill y = fix). If 
so, they are added to the set f2 y . This procedure is 
repeated for a certain number of samples. 

Simulation results and discussion 

We tested our algorithm with randomly generated net- 
works and the regulatory network of Escherichia coli (E- 
coli) [2]. The random networks consist of 2400 nodes with 
N = 200 and M = 1200. We have chosen the BFs from: 

• all functions with k < 15 (Type A) 

unate, i.e. locally monotone, functions with k < 15 
(Type B) 

After generating a network we draw a certain number 
T of uniformly distributed input vectors x and obtain y 
= f(x). For each y we applied then Algorithm 1 to obtain 
the marginal distributions L^"""\j e I . To investigate the 
convergence behavior with respect to t max we first apply 
hard-decision to evaluate a good choice for t max , i.e., we 
generate an estimate x by setting 



0 if Lj w) 

1 if 



> 0 
< 0 



Then we evaluate the network y = f(x) , and measure 
the similarity between y and y by counting the equal 
entries and divide them by the length of y. We did so 
for 100 networks of Type A and B, and set T = 100. 
The averaged results can be seen in Figure 2. 

One can see, that for t max S 14 there is almost no 
improvement in the similarity. This number is equal to 
two times the number of nodes between input and out- 
put, i.e., it seems to be sufficient that the messages travel 
once through the network and back. Thus, the following 
simulations have been perform setting t max = 14. 
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Figure 2 Similarity of y and y versus f„ 



Next, we apply sampling as described in Section Sam- 
pling. We did so for 100 different networks of Type A 
and B, and the E-coli network. For each random net- 
work we did T = 100 runs, for E-coli T = 1000. The 
results can be viewed in Table 1. We depict the percen- 
tage of solved networks, i.e. the portion of networks we 
found at least one valid x e D. y . Further, we give the 
average number of valid x and the average number of 
unique x. 

One can see from the results, that in general for most 
networks and ys at least one preimage can be found. It is 
worth mentioning, that for the E-coli network every 
sampled solution was unique. This is due to the fact, that 
there exist a few inputs, who completely determine the out- 
put. The other input variables have then no influence and 
hence a marginal distribution of 0.5. Further, the results for 
the network of type B are much better than for type A. It 
seems that the marginal distributions for unate functions 
give better estimation of the actual distribution than the 
marginal distributions for non-unate functions. 

Conclusions 

In this work, we proposed a probabilistic algorithm to 
address the preimage problem of Boolean networks. This 
is of interest when designing experiments, in which certain 
regulators are supposed to be in a specific state. Perform- 
ing a series of simulations with Random networks we 



Table 1 Simulation results for different networks 



network 


num of samples 


solved 


valid 


Unique 


Type A 


1000 


89% 


608.81 


4.43 


Type B 


1000 


95.9% 


270.74 


68.60 


E-coli 


1000 


98.6% 


193.3 


193.3 



showed, that the algorithm works not only for unate func- 
tions, of which most biologically motivated networks con- 
sist, but for any kind of Boolean functions. By replacing 
the fixed output values of the network by probabilities one 
can simply apply the algorithm to networks, whose desig- 
nated output is described by probability distributions. 
Further, the algorithm may be easily adjusted to work on 
stochastic, e.g. Bayesian, networks, where the nodes con- 
tain only transition probabilities instead of Boolean func- 
tion. Therefore, it is needed to adapt the update rules 
accordingly. It remains an open question, which influence 
topographic properties, such as number of layers and 
number of nodes in these layers, have to the performance 
of the proposed algorithms, since we only investigated net- 
works which are similar to the regulatory network of 
E-coli. 
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